Multilingual Data Selection for Low Resource Speech Recognition
نویسندگان
چکیده
Feature representations extracted from deep neural networkbased multilingual frontends provide significant improvements to speech recognition systems in low resource settings. To effectively train these frontends, we introduce a data selection technique that discovers language groups from an available set of training languages. This data selection method reduces the required amount of training data and training time by approximately 40%, with minimal performance degradation. We present speech recognition results on 7 very limited language pack (VLLP) languages from the second option period of the IARPA Babel program using multilingual features trained on up to 10 languages. The proposed multilingual features provide up to 15% relative improvement over baseline acoustic features on the VLLP languages.
منابع مشابه
Multilingual Recurrent Neural Networks with Residual Learning for Low-Resource Speech Recognition
The shared-hidden-layer multilingual deep neural network (SHL-MDNN), in which the hidden layers of feed-forward deep neural network (DNN) are shared across multiple languages while the softmax layers are language dependent, has been shown to be effective on acoustic modeling of multilingual low-resource speech recognition. In this paper, we propose that the shared-hidden-layer with Long Short-T...
متن کاملRapid Update of Multilingual Deep Neural Network for Low-Resource Keyword Search
This paper proposes an approach to rapidly update a multilingual deep neural network (DNN) acoustic model for low-resource keyword search (KWS). We use submodular data selection to select a small amount of multilingual data which covers diverse acoustic conditions and is acoustically close to a low-resource target language. The selected multilingual data together with a small amount of the targ...
متن کاملCross-lingual and Multilingual Speech Emotion Recognition on English and French
Research on multilingual speech emotion recognition faces the problem that most available speech corpora differ from each other in important ways, such as annotation methods or interaction scenarios. These inconsistencies complicate building a multilingual system. We present results for crosslingual and multilingual emotion recognition on English and French speech data with similar characterist...
متن کاملImproved Multilingual Training of Stacked Neural Network Acoustic Models for Low Resource Languages
This paper proposes several improvements to multilingual training of neural network acoustic models for speech recognition and keyword spotting in the context of low-resource languages. We concentrate on the stacked architecture where the first network is used as a bottleneck feature extractor and the second network as the acoustic model. We propose to improve multilingual training when the amo...
متن کامل"multilingual" Deep Neural Network for Music Genre Classification
Multilingual deep neural network (DNN) has been widely used in low-resource automatic speech recognition (ASR) in order to balance the rich-resource and low-resource speech recognition or to build the low-resource ASR system quickly. Inspired by the idea of using multilingual DNN for ASR, we use a “multilingual” DNN (Multi-DNN) for music genre classification. However, we do not have “multilingu...
متن کامل